To render high-quality images in real-time applications, it is common to trace only a few samples per pixel (spp) at a lower resolution and then supersample to the target resolution. Based on the observation that pixels rendered at a low resolution are typically highly aliased, we present a novel method for neural supersampling based on ray tracing 1/4-spp samples at the high resolution. Our key insight is that the ray-traced samples at the target resolution are accurate and reliable, which turns supersampling into an interpolation problem. We present a mask-reinforced neural network to reconstruct and interpolate high-quality image sequences. First, a novel temporal accumulation network computes the correlation between current and previous features to significantly improve temporal stability. Then, a reconstruction network based on a multi-scale U-Net with skip connections reconstructs and generates the desired high-resolution image. Experimental results and comparisons show that the proposed method produces higher-quality supersampling results than current state-of-the-art methods without increasing the total number of ray-tracing samples.
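To make the 1/4-spp idea concrete, the following is a minimal sketch of a sampling mask that traces one pixel in every 2x2 block of the target-resolution image. The function name `quarter_spp_mask` and the four-frame offset schedule are assumptions made for illustration; the abstract does not say how the traced pixels are chosen or whether the pattern changes between frames.

```python
import numpy as np

def quarter_spp_mask(height, width, frame_index):
    """Boolean mask with one traced pixel per 2x2 block (1/4 spp on average).

    Assumption: the traced position cycles through the four pixels of each
    2x2 block over four consecutive frames. This schedule is hypothetical.
    """
    offsets = [(0, 0), (1, 1), (0, 1), (1, 0)]
    dy, dx = offsets[frame_index % 4]
    mask = np.zeros((height, width), dtype=bool)
    mask[dy::2, dx::2] = True  # every second row and column, shifted by the offset
    return mask

# Four consecutive frames together visit every pixel exactly once
# (for even image dimensions).
masks = [quarter_spp_mask(4, 6, t) for t in range(4)]
coverage = np.sum(masks, axis=0)
```

Under this assumed schedule, a reconstruction network would receive the sparse traced pixels together with the mask, and would have to fill in the remaining three quarters of the pixels, which is the interpolation view described above.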
Translated by Google Translate
Low-dose computed tomography (CT) plays a significant role in reducing the radiation risk in clinical applications. However, lowering the radiation dose significantly degrades image quality. The rapid development and wide application of deep learning have opened new directions for low-dose CT imaging algorithms. We therefore propose a fully unsupervised one-sample diffusion model (OSDM) in the projection domain for low-dose CT reconstruction. To extract sufficient prior information from a single sample, a Hankel matrix formulation is employed. In addition, penalized weighted least-squares and total variation terms are introduced to achieve superior image quality. Specifically, we first train a score-based generative model on one sinogram, extracting a large number of tensors from the structural Hankel matrix as the network input to capture the prior distribution. Then, at the inference stage, the stochastic differential equation solver and a data consistency step are applied iteratively to obtain the sinogram data. Finally, the image is obtained through the filtered back-projection algorithm. The reconstructed results approach their normal-dose counterparts. The results show that OSDM is a practical and effective model for reducing artifacts and preserving image quality.
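As a rough illustration of why a Hankel formulation yields many training samples from one measurement, the sketch below builds a Hankel matrix from a 1D signal, where every row is a shifted window of the same data. This is a simplified 1D assumption: the abstract refers to a structural Hankel matrix built from a sinogram, whose exact construction is not given here, and the helper name `hankel_from_signal` is hypothetical.

```python
import numpy as np

def hankel_from_signal(signal, window):
    """Return the Hankel matrix H with H[i, j] = signal[i + j].

    Each row is a length-`window` slice of the signal, so a single signal
    of length n yields n - window + 1 overlapping samples.
    """
    signal = np.asarray(signal)
    if signal.ndim != 1 or not 1 <= window <= signal.size:
        raise ValueError("signal must be 1D and window must fit inside it")
    # sliding_window_view returns a read-only view; copy to get a writable array.
    return np.lib.stride_tricks.sliding_window_view(signal, window).copy()

H = hankel_from_signal([1, 2, 3, 4, 5], window=3)
# H has constant anti-diagonals:
# [[1 2 3]
#  [2 3 4]
#  [3 4 5]]
```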
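Face animation is one of the hottest topics in computer vision and has achieved promising performance with the help of generative models. However, generating identity-preserving and photo-realistic images remains a critical challenge because of sophisticated motion deformation and complex facial detail modeling. To address these problems, we propose a Face Neural Volume Rendering (FNEVR) network to fully explore the potential of 2D motion warping and 3D volume rendering in a unified framework. In FNEVR, we design a 3D Face Volume Rendering (FVR) module to enhance the facial details for image rendering. Specifically, we first extract 3D information with a carefully designed architecture, and then introduce an orthogonal adaptive ray-sampling module for efficient rendering. We also design a lightweight pose editor that enables FNEVR to edit the facial pose in a simple yet effective way. Extensive experiments show that FNEVR obtains the best overall quality and performance on widely used talking-head benchmarks.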
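Deep neural networks can easily memorize noisy labels when trained with a softmax cross-entropy (CE) loss. Previous studies that attempted to address this issue focus on incorporating a noise-robust loss function into the CE loss. However, the memorization issue is only alleviated and still remains because the CE loss itself is not robust. To address this issue, we focus on learning robust contrastive representations of data on which it is hard for the classifier to memorize label noise under the CE loss. We propose a novel contrastive regularization function to learn such representations over noisy data, so that label noise does not dominate representation learning. By theoretically investigating the representations induced by the proposed regularization function, we show that the learned representations keep information related to the true labels and discard information related to the corrupted labels. Moreover, our theoretical results also indicate that the learned representations are robust to label noise. The effectiveness of the method is demonstrated with experiments on benchmark datasets.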
Representing and synthesizing novel views in real-world dynamic scenes from casual monocular videos is a long-standing problem. Existing solutions typically approach dynamic scenes by applying geometry techniques or utilizing temporal information between several adjacent frames, without considering the underlying background distribution in the entire scene or the transmittance over the ray dimension, which limits their performance on static and occluded areas. Our approach, $\textbf{D}$istribution-$\textbf{D}$riven neural radiance fields, offers high-quality view synthesis and a 3D solution to $\textbf{D}$etach the background from the entire $\textbf{D}$ynamic scene; we call it $\text{D}^4$NeRF. Specifically, it employs a neural representation to capture the scene distribution in the static background and a 6D-input NeRF to represent dynamic objects. Each ray sample is given an additional occlusion weight that indicates how the transmittance is split between the static and dynamic components. We evaluate $\text{D}^4$NeRF on public dynamic scenes and on our urban driving scenes acquired from an autonomous-driving dataset. Extensive experiments demonstrate that our approach outperforms previous methods in rendering texture details and motion areas while also producing a clean static background. Our code will be released at https://github.com/Luciferbobo/D4NeRF.
Forecasts by the European Centre for Medium-Range Weather Forecasts (ECMWF; EC for short) can provide a basis for the establishment of maritime-disaster warning systems, but they contain some systematic biases. The fifth-generation EC atmospheric reanalysis (ERA5) data have high accuracy, but are delayed by about 5 days. To overcome this issue, a spatiotemporal deep-learning method could be used for nonlinear mapping between EC and ERA5 data, which would improve the quality of EC wind forecast data in real time. In this study, we developed the Multi-Task-Double Encoder Trajectory Gated Recurrent Unit (MT-DETrajGRU) model, which uses an improved double-encoder forecaster architecture to model the spatiotemporal sequence of the U and V components of the wind field; we designed a multi-task learning loss function to correct wind speed and wind direction simultaneously using only one model. The study area was the western North Pacific (WNP), and real-time rolling bias corrections were made for 10-day wind-field forecasts released by the EC between December 2020 and November 2021, divided into four seasons. Compared with the original EC forecasts, correction with the MT-DETrajGRU model reduced the wind speed and wind direction biases in the four seasons by 8-11% and 9-14%, respectively. In addition, the proposed method modelled the data uniformly under different weather conditions. The correction performance under normal and typhoon conditions was comparable, indicating that the data-driven model constructed here is robust and generalizable.
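Because the model works on the U and V wind components while the reported biases are for wind speed and direction, it may help to see how the two representations relate. The sketch below uses the standard meteorological convention (direction is where the wind blows from, measured clockwise from north); the function name `wind_speed_direction` is an assumption for illustration and is not taken from the study.

```python
import numpy as np

def wind_speed_direction(u, v):
    """Convert eastward (u) and northward (v) wind components to
    speed and meteorological direction in degrees.

    Direction is the bearing the wind comes FROM: 0 = north, 90 = east.
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    speed = np.hypot(u, v)
    # Negating both components turns the "blowing toward" vector into the
    # "coming from" vector; arctan2(x, y) then measures clockwise from north.
    direction = np.degrees(np.arctan2(-u, -v)) % 360.0
    return speed, direction

# A wind blowing toward the west (u < 0, v = 0) comes from the east.
speed, direction = wind_speed_direction([-10.0, 3.0], [0.0, 4.0])
```

Direction is a circular quantity, so any direction bias computed from such values has to wrap differences around 360 degrees. That is one plausible reason to train on U and V rather than on direction directly, although the abstract does not state this.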
We propose an analysis in fair learning that preserves the utility of the data while reducing prediction disparities under the criterion of group sufficiency. We focus on the scenario where the data contain multiple or even many subgroups, each with a limited number of samples. For this setting, we present a principled method for learning a fair predictor for all subgroups by formulating it as a bilevel objective. Specifically, the subgroup-specific predictors are learned in the lower level from a small amount of data and the fair predictor. In the upper level, the fair predictor is updated to be close to all subgroup-specific predictors. We further prove that such a bilevel objective can effectively control the group sufficiency and the generalization error. We evaluate the proposed framework on real-world datasets. Empirical evidence shows consistently improved fair predictions, with accuracy comparable to the baselines.
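The Tiny Actions Challenge focuses on understanding human activities in real-world surveillance. There are two main difficulties for activity recognition in this scenario. First, human activities are often recorded at a distance and appear at a small resolution without many discriminative cues. Second, these activities are naturally distributed in a long-tailed way, and it is hard to alleviate the data bias caused by such heavy category imbalance. To tackle these problems, we propose a comprehensive recognition solution in this paper. First, we train video backbones with data balancing to alleviate overfitting on the challenge benchmark. Second, we design a dual-resolution distillation framework, which can effectively guide low-resolution action recognition with super-resolution knowledge. Finally, we apply model ensembling with post-processing, which further boosts performance on the long-tailed categories. Our solution ranks first on the leaderboard.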
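The abstract does not say which data-balancing scheme is used. As one common option for long-tailed data, the sketch below assigns inverse-frequency sampling weights so that every class is drawn with equal total probability. The function name `balanced_sample_weights` and the toy labels are assumptions for illustration only.

```python
from collections import Counter

def balanced_sample_weights(labels):
    """Per-sample sampling probabilities that equalize the classes.

    Each sample of class c gets weight 1 / (num_classes * count[c]), so the
    weights sum to 1 and every class receives total probability 1 / num_classes.
    This is an assumed scheme, not necessarily the one used in the solution.
    """
    counts = Counter(labels)
    num_classes = len(counts)
    return [1.0 / (num_classes * counts[label]) for label in labels]

# Toy long-tailed label set: 8 head-class clips and 2 tail-class clips.
labels = ["walk"] * 8 + ["fall"] * 2
weights = balanced_sample_weights(labels)
tail_probability = sum(w for w, y in zip(weights, labels) if y == "fall")
```

Weights of this kind can be passed to a weighted random sampler when building training batches, so that tail classes are seen as often as head classes.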
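Although the use of multiple unmanned aerial vehicles (UAVs) has great potential for fast autonomous exploration, it has received very little attention. In this paper, we present RACER, a rapid collaborative exploration approach using a fleet of decentralized UAVs. To dispatch the UAVs effectively, a pairwise interaction based on an online HGRID space decomposition is used. It ensures that the UAVs explore distinct regions simultaneously, using only asynchronous and limited communication. In addition, we optimize the coverage paths of the unknown space and balance the workload partitioned to each UAV through a capacitated vehicle routing problem (CVRP) formulation. Given the task allocation, each UAV constantly updates its coverage path and incrementally extracts crucial information to support exploration planning. A hierarchical planner finds exploration paths, refines local viewpoints, and generates minimum-time trajectories in sequence to explore the unknown space agilely and safely. The proposed approach is evaluated extensively, showing high exploration efficiency, scalability, and robustness to limited communication. Furthermore, for the first time, we achieve fully decentralized collaborative exploration with multiple UAVs in the real world. We will release our implementation as an open-source package.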
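In recent years, mobile robots have become more ambitious and are being deployed in large-scale scenarios. As a high-level representation of the environment, a sparse skeleton graph is beneficial for more efficient global planning. Existing solutions for skeleton graph generation suffer from several major limitations, including poor adaptability to different map representations, dependence on robot inspection trajectories, and high computational overhead. In this paper, we propose an efficient and flexible algorithm that generates a trajectory-independent 3D sparse topological skeleton graph capturing the spatial structure of the free space. In our method, an efficient ray sampling and validation mechanism is adopted to find distinctive free-space regions, which become the vertices of the skeleton graph, with traversability between adjacent vertices forming the edges. A cycle formation scheme is also used to keep the skeleton graph compact. Benchmark comparisons with state-of-the-art works show that our approach generates sparse graphs in a shorter time while giving high-quality global planning paths. Experiments conducted in the real world further validate the capability of our method in real-world scenarios. Our method will be made open source to benefit the community.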